38 research outputs found

Text Extraction and Web Searching in a Non-Latin Language

Author: Lazarinis Fotis

Recent studies of queries submitted to Internet search engines have shown that non-English and unclassifiable queries have nearly tripled during the last decade. Most search engines were originally engineered for English. They do not take full account of inflectional semantics, nor of features such as diacritics or the use of capitals, which are common in languages other than English. The literature concludes that searching with non-English and non-Latin queries is less successful and requires additional user effort to achieve acceptable precision. The primary aim of this research study is to develop an evaluation methodology for identifying the shortcomings and measuring the effectiveness of search engines with non-English queries. It also proposes a number of solutions to the existing situation. A Greek query log is analyzed with respect to the morphological features of the Greek language. A text extraction experiment also revealed problems related to encoding and to the morphological and grammatical differences among semantically equivalent Greek terms. A first stopword list for Greek, based on a domain-independent collection, has been produced and its application in Web searching has been studied. The effect of lemmatization of query terms and the factors influencing text-based image retrieval in Greek are also studied. Finally, an instructional strategy is presented for teaching non-English students how to use search engines effectively. The evaluation of the capabilities of the search engines showed that international and nationwide search engines ignore most of the linguistic idiosyncrasies of Greek and other complex European languages. There is a lack of freely available non-English resources to work with (test corpora, linguistic resources, etc.). The research showed that applying standard IR techniques, such as stopword removal, stemming, lemmatization and query expansion, to Greek Web searching increases precision.
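
The abstract's point about diacritics and capitals can be illustrated with a minimal sketch. It is not taken from the thesis: the function name `normalize_greek` and the choice of Unicode case folding plus accent stripping are assumptions made here to show one way a search engine might treat variant spellings of a Greek term as equivalent.

```python
import unicodedata


def normalize_greek(term: str) -> str:
    """Hypothetical helper: fold case and strip diacritics from a query term."""
    # casefold() lowercases and also maps final sigma (ς) to regular sigma (σ).
    folded = term.casefold()
    # NFD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", folded)
    # Drop the combining marks (tonos, dialytika) and keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


if __name__ == "__main__":
    # Accented mixed-case and unaccented upper-case spellings of the same word.
    print(normalize_greek("Αθήνα") == normalize_greek("ΑΘΗΝΑ"))
```

Under these assumptions, accented and all-capital spellings of the same word map to one form. This handles only surface variation and does not address the inflectional differences the abstract also mentions, which would need stemming or lemmatization.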

Sunderland University Institutional Repository

Preserving Cultural Heritage Using Open Source Collection Management Tools

Author: Gkoumas Georgios
Lazarinis Fotis
Publication venue: Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Publication date: 01/01/2013

Open source software (OSS) is growing steadily in popularity, and many OSS systems could be used to preserve cultural heritage objects. Such solutions make developing a digital collection affordable for organizations. This paper reviews two OSS tools, CollectionSpace and the Open Video Digital Library Toolkit, and discusses how they could be used to organize digital replicas of cultural objects. The features of the software are presented and some examples are given.

Bulgarian Digital Mathematics Library at IMI-BAS

Digitization and Preservation of City Landmarks Using Limited and Free Web Services

Author: Georgiou Sotiris
Lazarinis Fotis
Publication venue: Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Publication date: 01/01/2013

This paper presents a practical approach for digitizing city landmarks based on free and limited Web resources. The digital replicas are then placed on the Web using popular services, such as Google Earth, and are accessible to a huge user base. The method is easily applicable and quite valuable to organizations with limited funding.

Bulgarian Digital Mathematics Library at IMI-BAS

Creating personalized assessments based on learner knowledge and objectives in a hypermedia Web testing application

Author: Fotis Lazarinis
Steve Green
Elaine Pearson
Publication venue: Elsevier BV
Publication date: 01/12/2010

Crossref

Teesside University's Research Repository

Online risks obstructing safe internet access for students

Author: Fotis Lazarinis
Publication venue: Emerald

Crossref

Combining Information Retrieval with Information Extraction for Efficient Retrieval of Calls for Papers

Author: Fotis Lazarinis

In many domains, specific attributes in documents carry more weight than the general words in the document. This paper proposes the use of information extraction techniques to identify these attributes for the domain of calls for papers. Incorporating attributes into queries imposes new requirements on the retrieval method of conventional information retrieval systems. A new model for estimating the relevance of documents to user requests is also presented. The effectiveness of this model and the benefits of integrating information extraction with information retrieval are shown by comparing our system with a typical information retrieval system. The results show a precision increase of between 45% and 60% at all recall points.

1 Introduction

Information retrieval (IR) systems, also called text retrieval systems, help users retrieve information that is relevant or close to their information needs. Even though specific words may be key attr..

CiteSeerX

Crossref